Bugfix/Fix data import from non UTF-8 files #399

Merged (1 commit) on Oct 15, 2019

@c-w (Member) commented Oct 14, 2019

As reported in #88, it's currently impossible to import data from files that are not encoded in UTF-8. This pull request fixes this limitation by leveraging chardet's UniversalDetector to automatically detect the encoding of the uploaded file without buffering the entire file in memory.
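For readers unfamiliar with the approach, here is a minimal sketch of incremental encoding detection. It is an illustration under stated assumptions, not the code from the merged commit: the helper names `detect_encoding` and `open_as_text`, the `chunk_size` and `default` parameters, and the injectable `detector` argument are all hypothetical. It also assumes the uploaded file is a seekable binary stream. The only chardet API used is `UniversalDetector` with its `feed`, `done`, `close`, and `result` members.

```python
import io


def detect_encoding(stream, chunk_size=4096, detector=None, default="utf-8"):
    """Guess the encoding of a seekable binary stream, one chunk at a time.

    `detector` defaults to chardet's UniversalDetector; any object exposing
    the same feed/done/close/result interface can be passed instead.
    """
    if detector is None:
        # Imported lazily so the helper can be exercised with a stand-in.
        from chardet.universaldetector import UniversalDetector
        detector = UniversalDetector()

    start = stream.tell()
    # Read fixed-size chunks and stop as soon as the detector is confident,
    # so only one chunk is held in memory at any time.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        detector.feed(chunk)
        if detector.done:
            break
    detector.close()
    stream.seek(start)  # rewind so the caller can decode from the beginning

    # The detector may report no encoding (e.g. for an empty file).
    return detector.result.get("encoding") or default


def open_as_text(stream, **kwargs):
    """Wrap a binary stream in a text reader using the detected encoding."""
    encoding = detect_encoding(stream, **kwargs)
    return io.TextIOWrapper(stream, encoding=encoding, newline="")
```

A parser could then iterate over `open_as_text(uploaded_file)` line by line instead of assuming UTF-8. Detection is a statistical guess, so short or ambiguous inputs may still be misidentified.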

Resolves #88

@Hironsan merged commit 0d089a5 into doccano:master on Oct 15, 2019
@Hironsan (Member) commented

Thanks!

@c-w deleted the bugfix/import-non-utf8-files branch on October 15, 2019, 12:34
@c-w mentioned this pull request on Dec 13, 2019
Successfully merging this pull request may close these issues.

Uploading non UTF-8 csv causes UnicodeDecodeError